1. The general problem
1.1 Genetic classification for the masses
Over-generalizing a bit, it seems to me that there has been a
polarization in the field of genetic classification methodology
in recent years. The extreme ends of the scale are:

The "botanical approach": one must reconstruct "from the 
bottom up" through all sub-families to the phylum level. 



An obvious problem with this approach is this: how does one
know in advance what the detailed structure is? Work must clearly
proceed "from the top down and from the bottom up"
simultaneously; e.g., you cannot do Italic correctly without
knowing something about Germanic and Indo-European.



Other extreme statements are also made by some
"traditionalists", e.g. that only morphological innovations are
legitimate evidence for genetic classification, or that one must
not only find regular correspondences but also account for all
exceptions to regular correspondences (W. P. Lehmann, source not
at hand). Such dogmatic conditions set requirements which have
been only ideals even for the most-studied fields such as
comparative Indo-European.



The "fishbowl approach": there are "global etymologies",
relating all world languages, transparent by inspection to all but
Comparative/Historical (C/H) linguists (the parallel with the
Emperor's New Clothes, visible to all but fools, is striking).

Language classification is really quite easy and can be done by
anyone! Ruhlen (1994: viii) quotes Greenberg as follows: "to really
screw up classification you almost have to have a Ph.D. in
historical linguistics. Ordinary folks, with no training,
inevitably arrive at the correct solution". This is made possible
by "the method of Multilateral Comparison".

1.2 The cloudy fishbowl

But "Multilateral Comparison" (MLC), aka "mass comparison", is
not really a method of doing genetic language classification: it
is a pre-theoretical step preceding Comparative/Historical (C/H)
reconstruction, which is the real method. This is admitted even by
the practitioners of MLC (v. Ruhlen 1994: 130 and Greenberg 1987:
27), but they state that what they are doing is classification
preceding the comparative method, which presumably does something
else. One wonders "why bother with the C/H method?" if
classification without it is so easy (and in fact, MLC
practitioners do not bother with it). But one does not usually
elevate pre-theoretical inspection of the data to the status of
method. For example, do syntacticians set forth their
introspective musings as method before undertaking the actual
syntactic analysis in some theoretical framework?

An explanation of how MLC works, extracted from Ruhlen's book
for the layperson (1994: 8-9), is as follows: "The classification
of languages into different families is based on discovering words
[sic] in different languages which are similar in sound and
meaning... Throughout this book you will be shown tables that list
words in different languages, and you will be asked, on that
basis, to classify the languages into families ... Your task is to
classify these languages into language families simply on the
basis of perceived similarities ... since the meaning of all the
forms is the same, you need concern yourself only with deciding
which forms are similar in their constituent sounds".

In the book at hand, meanings are said not to differ within the
lists given, but in practice in MLC work (such as Greenberg 1963),
similarities in both sound and meaning are judged by the
investigator, amateur or otherwise.

A classical statement of why MLC is presumed to work is given
by Greenberg in an early paper (1953: 271-2) and repeated in his
1963 (3). If the probability of acceptable similarity by chance
between languages A and B on a given item is p, then the
probability that a third language C has an acceptably similar item
by chance is $p^2$, of a fourth language D having a similar item by
chance is $p^3$, etc., to $p^{n-1}$ for n languages. The probability
rapidly approaches zero, e.g. at 5% chance similarity, for four
languages, $p^3 = .000125$ (1 in 8000).
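
The "powers" argument just stated can be checked with a few lines
of Python. This is only a sketch of the arithmetic as described
above; the function name is mine, not Greenberg's, and the 5% figure
is the illustrative value used in the text.

```python
def chance_agreement(p: float, n: int) -> float:
    """Probability that n languages all agree by chance on one item,
    given that a resemblance has already been noted in the first one:
    each further language contributes a factor p, giving p**(n - 1)."""
    if n < 2:
        raise ValueError("need at least two languages to compare")
    return p ** (n - 1)


# Illustrative 5% chance similarity, for two to five languages.
for n in range(2, 6):
    print(n, chance_agreement(0.05, n))
```

For four languages the function reproduces the 1-in-8000 figure
quoted above.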

But there is an overwhelming fallacy in applying this
reasoning. The argument as stated applies to the case in which a
similarity has already been noted in two or more languages. But
the situation of interest is actually one in which one begins with
a set of languages, say 12 of them, and then looks at a particular
item on the word-list, e.g. 'hand'. Starting with language 1 on
the list, one then looks at language 2: the probability that a
match will not be found is 19/20 or .95. The probability that a
match is not found with L1 vs. L2 or L1 vs. L3 (if all the L's are
independent of each other) is then $.95^2$ or .90 (rounding off).
This continues till L12 has been reached and the probability that
no match is found in any of the 11 cases is $.95^{11}$ or about .57
(see note 1).

In other words, by just looking at L1 vs. L2...L12, there is
already a probability of about .43 (= 1 - .57) that at least one
match is found even if all L's are independent of each other. But
there are still 55 more ways to choose pairs out of the 12
languages, starting with L2 vs. L3...L12, then L3 vs. L4...L12, etc.
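
A minimal Python sketch of this step follows. It assumes, as the
text does, a 5% chance of an accepted match per comparison and
treats the comparisons as independent; the extension to all 66 pairs
is my own rough illustration under that same independence
assumption, which note 1 warns is not strictly valid.

```python
from math import comb


def p_no_match(p: float, comparisons: int) -> float:
    """Probability that none of the independent comparisons matches."""
    return (1 - p) ** comparisons


def p_at_least_one(p: float, comparisons: int) -> float:
    """Probability that at least one independent comparison matches."""
    return 1 - p_no_match(p, comparisons)


P = 0.05                      # illustrative chance of an accepted match
first_vs_rest = 11            # L1 against L2...L12
all_pairs = comb(12, 2)       # every pair among the 12 languages
remaining = all_pairs - first_vs_rest

print(p_no_match(P, first_vs_rest))      # about .57, as in the text
print(p_at_least_one(P, first_vs_rest))  # about .43
print(remaining)                         # the 55 further pairs
print(p_at_least_one(P, all_pairs))      # rough figure for all 66 pairs
```

Even under these simplifying assumptions, a chance match somewhere
among the 66 pairs is far more likely than not.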

If a pair is found, then there are 10 ways it can be extended
to a triple by looking at the remaining languages. In fact, while
$p^n$ is decreasing, the number of ways one can find sets of 2, 3,
4, etc. languages out of a given number (e.g. 12) increases to a
maximum at half the number (in this case 6): there are 66 ways to
choose 2 (or 10) out of 12, 220 ways to choose 3 (or 9) out of 12,
495 ways to choose 4 (or 8) out of 12, 792 ways to choose 5 (or 7)
out of 12, and 924 ways to choose 6 out of 12. The mathematics is
much more complicated than the simplistic "powers" argument implies
and it is clear that the chance of getting an n-ary comparison is
much larger than the simple $p^n$ because there are so many ways of
choosing pairs, triplets, etc.
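
The counts above are binomial coefficients and can be reproduced
directly. The second part of the sketch below is my own addition,
not a calculation from the paper: it multiplies each count by
$p^{k-1}$ to give the expected number of k-language sets agreeing by
chance on a single item, assuming p = .05 and that "same" is
transitive.

```python
from math import comb

N_LANGUAGES = 12
P = 0.05  # illustrative chance similarity, as in the text

# Number of ways to choose k languages out of 12.
ways = {k: comb(N_LANGUAGES, k) for k in range(2, N_LANGUAGES - 1)}

# Expected number of k-sets that agree by chance on one item
# (assumption: a given k-set agrees with probability P**(k - 1)).
expected_sets = {k: ways[k] * P ** (k - 1) for k in ways}

for k in ways:
    print(k, ways[k], expected_sets[k])
```

On these assumptions one expects more than three chance pairs and
about half a chance triple per word-list item, before any latitude
in sound or meaning is allowed.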

Two practical objections also arise: 

(i) rarely are more than two or three languages involved in the
comparisons (see 3 below for examples from Greenberg 1963);

(ii) MLC practitioners allow themselves wide latitude,
unconstrained in any principled way, as to what is acceptable as
"similar" in both sound and meaning; this greatly multiplies the
chances of accepted matches (i.e. it makes p much larger than 5%
or similar values; Greenberg 1953 used 8% as an illustration; in
fact, it can easily approach 1, i.e. certainty, in such
unconstrained cases). The principled phonological constraints are
provided by the comparative method. Constraining semantics is an
unsolved and perhaps unsolvable problem, but there are practical
means such as accepting only semantic shifts documented in the
family or area, or a fixed small number of synonyms.

It would be difficult to set up 12 items in which no two look 
"similar" by the standards used by the MLC practitioners. Let us 
look at the word for 'hand' in the 12 representative N-S languages 
I use in 2.3 below: 



     A:   Gao      ka(m)ba
     B:   Kanuri   musko
     C:   Aiki     kara
     D:   For      -o ija
    *E:   PE.S.    asi
    *F:   PC.S.    sili
     G:   Berta    daba
     H:   Kunama   kona
     I:   Twampa   med
     J:   Sai      ela
     K:   Ik       kʷeta
     L:   Krongo   niiso



The sets A, D, H and C, J are likely choices by the MLC method,
especially bearing "movable k" in mind (the phenomenon noted by
Greenberg [1963: 116, 132] that N-S nominals sometimes appear
with or without k-, from language to language or even within the
same language). More adventurous sorts might include *E and *F and
even perhaps L as a set; even A and G or B and L cannot be ruled
out as possibilities. The "MLC Method" does indeed make it easy
even for the lay person to relate languages!

In my reconstruction work, the C, *E, G, H, and K items above
occur singly in five different isoglosses, while *F and J occur in
my #191 (along with Fp (h)eli and I Opo 'elbow' sil-). Thus, MLC
would lead to completely wrong results in this instance.

Refutation of MLC must be statistical because the question is a
statistical one: it is one of probabilities, not possibilities (as
stated by Franz Rottland at the Prague Round Table on Lexical
Diffusion in Sub-Saharan Africa, August, 1993). The question is
"How much is enough?", either in terms of numbers of positive
instances or in terms of "quality" of examples. Quality itself must
be made objective, i.e. numerical, in some way if we are to pass
beyond mere subjective judgments. It is easy to accept genetic
relatedness of English and German or Italian and Spanish by
inspection (recognizing that plenty of high-quality comparisons can
be produced), but the interesting cases are precisely those for
which such easy judgments are not possible.

The basic question is: what statistical measure can be applied
to state that languages A and B are likely to be genetically
related at a given confidence level? The test developed by Donald
Ringe (1992, 1993) provides one answer to this question. It is
not, as misrepresented by critics, a method of doing C/H
linguistics and therefore it is also not a replacement for standard
methods. The key to utilizing the test is recognizing that every
language has its own set of phoneme frequencies and therefore every
pair of languages has a different set of paired phoneme
frequencies. Thus, comparison must be binary until or unless
someone develops the mathematics to do n-ary comparisons under
these conditions. (But it is doubtful that this latter is a
desirable goal, since n-ary comparisons can always be decomposed
into binary ones.)
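
To make the point about paired phoneme frequencies concrete, here
is a small Python sketch. It is not Ringe's published procedure,
only an illustration of the general idea under my own simplifying
assumptions: the consonant frequencies are invented, a chance
pairing of x in language A with y in language B is taken to occur
with probability f_A(x) * f_B(y), and the number of such pairings in
a 100-item list is treated as binomial.

```python
from math import comb


def binomial_tail(n: int, q: float, k: int) -> float:
    """P(X >= k) for X ~ Binomial(n, q)."""
    return sum(comb(n, i) * q**i * (1 - q) ** (n - i) for i in range(k, n + 1))


# Hypothetical word-initial consonant frequencies (invented for illustration).
freq_a = {"t": 0.15, "k": 0.10}   # language A
freq_b = {"d": 0.12, "g": 0.08}   # language B


def chance_rate(x: str, y: str) -> float:
    """Chance probability that x in A is paired with y in B on one item."""
    return freq_a[x] * freq_b[y]


n_items = 100
q_td = chance_rate("t", "d")              # specific to this language pair
expected_td = n_items * q_td              # pairings expected by chance alone
tail_td = binomial_tail(n_items, q_td, 6) # chance of 6 or more such pairings

print(expected_td, tail_td)
```

Because the rate depends on the frequencies in both languages, the
threshold for "more than chance" has to be worked out separately for
each pair, which is why the comparison is binary.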

Ringe has applied the Test or reasoning derived from it at the
99% level to Indo-European (with a positive result), Indo-Uralic
(very likely related but perhaps not reconstructable),
Illich-Svitych's "Nostratic" (negative), and Greenberg's "Amerind"
(negative) and to other problems which he will discuss in his paper
in this session.

2. African examples

Time constraints do not allow me to present detail in this
section. I have presented the Omotic and Nilo-Saharan results
elsewhere (Forth. and 1994 unpub., respectively) and will present
the East Sudanic results at the 6th Nilo-Saharan Conference. The
latter will be a study of what the Ringe Test reveals at the 94%
level for the nine proposed branches of East Sudanic, with a
detailed analysis of how the results compare with MLC and C/H
results for the same units.

3. Greenberg's African Classification 
3.1 Background

Greenberg's genetic classification of African languages was a
major breakthrough in a field which was somewhat chaotic. It was
first a series of articles (Greenberg 1949-54 in Southwestern
Journal of Anthropology, v. full references in Greenberg 1963),
then collected in a small volume (1955). The final report (1963)
was reprinted unchanged in 1966 and 1970, incorrectly referred to
as "2nd and 3rd editions". It was at first largely rejected,
especially in Europe, for mainly invalid reasons: rejection of
genetic classification as such, non-acceptance of applying the
method to "exotic" languages, reluctance to abandon preconceptions
such as the racist "Hamitic" concept or the idea that Bantu is an
archaic family.

While giving Greenberg full credit for his accomplishment, one
should not over-emphasize the degree of chaos prevailing in the
African classification field as of 1949, when his articles began to
appear, nor the originality of his classification. Much of the
groundwork had been laid by predecessors such as Westermann and
Köhler, whom Greenberg properly acknowledges (e.g. see Ruhlen's
survey history in his 1987: 76-124).

Many of the problems were relatively small-scale, e.g. the
position of Fula and Hausa, both having the "isolated-language"
mystique at the time, and the position of Bantu, which some thought
must be a major unit since it is so widespread and has so many
varieties. Others were the hangover of the racist "Hamitic"
concept, which wrongly brought physical type and cultural traits
such as pastoralism into linguistic classification and needlessly
complicated the placing of such languages as Masai. Others were
backward concepts such as indiscriminate mixing of languages in
"semi-" and "-oid" types, undocumented massive borrowings, and
merging of typological and genetic classification. Others were the
more prosaic lack of data in Chadic other than Hausa, southwest
Ethiopia, and many other areas.

Greenberg's accomplishment was to arrive at the first
continent-wide genetic classification based on sound-meaning
correspondences and cutting through the mass of misconceptions
prevalent at the time. Its positive points far outweighed its
shortcomings.

3.2 Criticisms 

But there were and are legitimate criticisms of the Greenberg 
classification. I will refer herein mainly to Nilo-Saharan, which 
was and is the most controversial unit. 

(1) Errors. Many which appeared in the 1955 collection were
maintained in the 1963 revision. For example (as pointed out by
Winston 1966), the little table of forms (1955: 109) which was to
serve as an example of the method later known as Multilateral
Comparison (MLC) contains several egregious errors. As Winston
indicates and I amplify, this does not inspire confidence in the
method or its application. The most serious error is listing under
the gloss 'hand' items in Bantu and Kanuri which mean 'head' (also,
I now find, in Teda, Zagawa, and Berti). Worse yet, the
corresponding Efik term really means 'father' and is compared
positively to the Bantu term. The table was repeated with the same
errors in 1963 (p. 4, on which the method is referred to as "mass
comparison").

Two other instances will suffice for now. On p. 111 (line 3
from bottom) of 1963 (not found in the 1955 version), a crucial
argument is reversed with the wording: "7. Third person subject k-
independent constructions." Context shows that what was meant is
"...k- in dependent...". In the word list for East Sudanic, I
counted only 7 attestations of family 6 Temein out of 131 items
(more on this below) and then found that one of them (no. 54 on p.
100) is really family 5 Nyima, mislabeled "(6) Nyima".

(2) Uneven documentation. Using East Sudanic (E.S.) as an
example, there are in Greenberg's formulation 10 families: 1.
Nubian, 2. Surmic (his Murle, etc.), 3. Nera (his Barea), 4. Jebel
(his Ingassana), 5. Nyima, 6. Temein, 7. Tama (his Merarit, etc.),
8. Daju (his Dagu, etc.), 9. Nilotic, 10. Kuliak (his Nyangiya).
The 131 lexical items (some with sub-varieties) include about 400
citations by families. Of these, half (92 Nubian and 109 Nilotic)
are accounted for by two families. Temein has only 6 (with the
erroneous no. 54 corrected) and Nyangiya only 10.

The others are E2 (abbreviation for Family 2): 45, E3: 37, E4:
21, E5: 17, E7: 44, E8: 19. None equals as much as half the
citations of Nubian or Nilotic, which are by far the best-attested
families in E.S. In fact, the Nubian and Nilotic citations are so
high partly because items are cited which occur in single
languages or sub-families of these entities. Using "common Nubian
or Nilotic" or better yet reconstructed forms (which Greenberg
rejects in his methodology) would largely redress the imbalance.

The same remarks apply to the morphological elements given in 
support of East Sudanic, but I will not go into this here. 

Another aspect of this problem is the fact that the "mass
comparison" is based mainly on 2, 3, or 4 families out of the
possible 10 in E.S. In fact, the numbers of families involved are
as follows: two: 52, three: 46, four: 24, five: 5, six: 2, seven: 1,
eight: 1 (total 131). Thus, double and triple instances account for
most of the evidence and only nine cases out of 131 include more
than four families out of the ten. I will return to this below.
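
The tallies in this subsection are easy to verify mechanically. The
following Python sketch simply re-enters the counts given above for
Greenberg's East Sudanic list (family labels as in the text) and
checks the totals; nothing in it goes beyond those figures.

```python
# Citations per family in Greenberg's East Sudanic list (counts from the text).
citations = {
    "E1 Nubian": 92, "E2 Surmic": 45, "E3 Nera": 37, "E4 Jebel": 21,
    "E5 Nyima": 17, "E6 Temein": 6, "E7 Tama": 44, "E8 Daju": 19,
    "E9 Nilotic": 109, "E10 Kuliak": 10,
}

# Number of lexical items attested in exactly k families (counts from the text).
items_by_family_count = {2: 52, 3: 46, 4: 24, 5: 5, 6: 2, 7: 1, 8: 1}

total_citations = sum(citations.values())
top_two = citations["E1 Nubian"] + citations["E9 Nilotic"]
total_items = sum(items_by_family_count.values())
two_or_three = items_by_family_count[2] + items_by_family_count[3]
more_than_four = sum(n for k, n in items_by_family_count.items() if k > 4)

print(total_citations, top_two / total_citations)
print(total_items, two_or_three / total_items, more_than_four)
```

Pairs and triples together make up about three quarters of the 131
items, which is the sense in which the "mass" comparison is mostly
binary or ternary.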

For N-S, the imbalance is not so marked: as expected, the vast
Chari-Nile group is over-represented and the single language For
is under-represented. However, the main problem here is the
skewing introduced by the very presence of the invalid "Chari-Nile"
grouping.

(3) Data do not support results. As Goodman (1970) and Bender
(1976) show, Greenberg's "Chari-Nile Family" consisting of East
Sudanic, Central Sudanic, Berta, and Kunama is not justified by
the data. I found East Sudanic itself to be coherent, whereas
Goodman did not. This is the main error; others are minor, given
the status of the comparative data available at the time.

3.3 Success despite faulty methodology and application 

Others have pointed out similar problems with Greenberg's data 
and analysis of the other phyla (e.g. Leslau 1958 for Afrasian 
etymologies). Nevertheless, Greenberg's classification has won out 
and forms the basis for most Africanist work today. In my view, 
his Nilo-Saharan was a brilliant accomplishment. My own intensive 
work of two decades leads me to the following revision, based on 
morphological innovations (Bender 1989, 1991). It is best seen in 
diagram form (below). Changes from Greenberg are given
immediately following the chart.

    N-S
     |
     +-- A  B  K                   "Outliers"
     |
     +-- Satellite-Core Group
          |
          +-- C  D  F  G  H        "Satellites"
          |
          +-- Core Group
               |
               +-- E  I  J  L

    A Songay             G Berta
    B Saharan            H Kunama
    C Maba               I Koman
    D For                J Gumuz
    E East Sudanic       K Kuliak
    F Central Sudanic    L Kado

Note that "Outliers" and "Satellites" do not constitute families.

Changes from Greenberg 1963: 

Place Songay (Songhai), Saharan, and Kuliak (Nyangiya) as top
branches coordinate with a large group called "Satellite-Core".

"Chari-Nile" is broken up: East Sudanic goes into the Core
Group and the others (Central Sudanic, Berta, Kunama) are
Satellites, coordinate to the Core and to each other.

Maba and For (Fur) are also Satellites. 

Combine East Sudanic, Koman, Gumuz, and Kado into a Core Group.
Kado is a new branch: it is Greenberg's "Tumtum" (1963: 149),
which he stated diverged considerably from the other Kordofanian
(of "Niger-Kordofanian") languages he grouped it with.

Within East Sudanic: delete E10 "Nyangiya" (Greenberg 1963: 128
note 3 expressed a reservation about Nyangiya belonging to E.S.);
divide into two sub-families consisting of Ek: E1, 3, 5, 7 and En:
E2, 4, 6, 8, 9 respectively; these are based on retention of k in
the 1st-person pronoun in Ek and innovation of n in En (though E5
and E6 remain problematical in their placements).

(I am omitting minor changes which apply to small groups or
individual languages.)

There are two main reasons I feel that Greenberg got such good 
results despite the flawed data base and analysis: 

(1) He made use of some morphological innovations of the kind
which appeal to orthodox comparative/historical (C/H)
linguisticians. To cite only a very few examples: the N-S pronoun
pattern 1st/2nd/3rd person a/i/e and (more often in possessives)
a/o/e; sg. N/pl. K (already suggested by Bryan, v. Tucker and Bryan
1966; but I think their distribution is more limited than they or
Greenberg believed); verbal causative in -t-.

(2) More importantly, he identified a number of E.S. and N-S
isoglosses in his "mass comparison" lists. I will consider the
nine instances of Greenberg's most widespread items in E.S.
mentioned in 3.2 above and then nine found in five or all six of
Greenberg's proposed divisions of N-S.

3.3.1 East Sudanic

Referring to Greenberg's East Sudanic list (1963: 95-108), in
citations I give representatives of the various forms he lists (he
does not use *-forms or formulas), sometimes with capital letters
to represent variations (e.g. K means g, k, possibly q).

G#32 'cow' in 8 groups (E1, 2, 4, 5, 7, 8, 9, 10) as t- or d-, often
with -N or -K. G also has this under his N-S #41 along with C
Maba. My E.S. isogloss *t ei/taq. (Here, the slash divides Ek from
En, my two E.S. sub-families containing E1, 3, 5, 7 and E2, 4, 6, 8, 9
respectively.) Of course this term may more likely be evidence of
cattle-culture diffusion in East Africa than of genetic
relationship (v. Bender 1982).

G#78 'mouth' in 7 groups (E1, 3, 4, 5, 7, 8, 10) as aK, qal, kul,
aulo. Included partly in my Ek isogloss *aqgul and in N-S
"Fragment" (not widely-enough distributed to be an isogloss)
'tongue' NaL- in families E and H.

G#86 'rain' (also 'river, water, sky') in 6 groups (E1, 5, 7, 8, 9,
10) as ar, korei, etc. (also included in Greenberg's N-S set as
part of his #109). My "excellent" N-S isogloss #2 *ar, found in 10
families: ABK|CDFcGH|EI. (Here, the | separates "Outliers" from
"Satellite-Core" from "Core" in that order; Fc is a family and is
one of two sub-divisions of Central Sudanic, the other being Fp,
not a genetic group in itself.)

G#126 'who?' in 6 groups (E1, 2, 3, 5, 7, 8) as na ~ qa (and under
N-S as #152 ji(i)a). Not included in my lexical comparisons, but in
grammatical study (1991: 12) found as -q- in CFG|E; thus it would
be a possible isogloss for Satellite-Core (S-C).

G#4 'arrive' (also 'come') in 5 groups (E1, 2, 3, 5, 9) as TVR
(also included in G's N-S as part of his #8; his inclusion of
Gumuz tona is not justified). My "fair" N-S isogloss #164 tOr+,
found in B|FpH|EkL.

G#59 'hand' in 5 groups (E1, 2, 3, 5, 8) as ad, ed. My Ek isogloss
*at (with citations also in En).

G#61 'head' in 5 groups (E1, 3, 4, 7, 9) as ur, ol, kele. My E.S.
isogloss *Ur/Ol.

G#65 'house' (also 'here, there, place') in 5 groups (E1, 4, 6, 9,
10) as ka, kwi, wee, oik. Also found as part of G's N-S #78. I do
not accept this one on phonological and semantic grounds, having
only a Fragment in N-S of form wai in C Maba and an isogloss *wVl
in Ek.

G#117 'tooth' in 5 groups (E1, 2, 3, 4, 10) as ni(gi)T. My E.S.
isogloss */ii + T.

To summarize, it is clear that Greenberg's findings for these
nine items are a good start, taking into account the data
available at the time and the fact that N-S was only part of a much
wider project. The result certainly fulfills the pre-theoretical
task of indicating East Sudanic as a probable family to be
investigated rigorously. See Ross 1991 and Bender et al. Forth.

More generally, perhaps this result implies that MLC can work
fairly well as a pre-test for families of moderate depth such as
IE and E.S. (see 2, 3 above): eight of G's nine most extensive
comparison sets coincide fairly well with cognate sets. But
Nilo-Saharan is a greater challenge, as we shall see.

3.3.2 Nilo-Saharan 

Greenberg's N-S items (1963: 133-48) suffer from the
unfortunate aberration of "Chari-Nile". His N-S consists of six
families: Songay, Saharan, Maba, Fur, "Chari-Nile", Coman. The
last-named combines my Koman and Gumuz (plus "Mao" languages, which
are now known to be Omotic; see Bender 1989b). The numbers of
families involved in G's 161 items are: two: 61, three: 68, four:
23, five: 7, six: 2. The nine best cases (5 or 6 families included)
are:

G#81 'kill, die' in all six families as wi, wu, yeyi, etc. I
have a Fragment (probably should be a weak isogloss in S-C) wi,
iy, etc. found in A|CDF|I.

G#87 'lightning' in all six families as mVl, mud-, bil. I do
not accept this one, having only Fragment bEL in K|E.

G#22 'blood, red' in all but D For as (K)eri, KVR. My "good"
isogloss #40 *k+ar+ in AB|CFH|E and an overlapping item #323 of
form *(k)ORi in AB|CFpG|EL and also in Mande (as Rali, Roli) and
possibly in Proto-Niger-Kongo as *Rodi, Roli, in particular in
Volta-Congo (see Williamson 1989 for classification) as kre, kila.

G#26 'breast, chest' in all but D For as gani, akun, etc. My
"good" isogloss #45 *kin+t ~ kun+t found in BK|CFH|EJ.

G#61 'fire' in all but A Songay as azza, su, udu, ito, woti,
etc. This item is divided between my "fair" isogloss #159 *-SI in
B|CF|IL and Fragment wut, od, etc. in D|I.

G#65 'go, walk' in all but D For as KV. This is divided between
my "good" isogloss #34 *ga(w) o in ABK|F|I and #315 *ka in AB|CH|I,
also found in Mn and Volta-Congo as ka, ko.

G#88 'lion, leopard' in all but I "Coman" as mVr, muddu. I have
this as a "symbolic" item #270 *mEr in ABK|CDGH|-, based on the
possibility of feline sound-symbolism, although I myself find this
unlikely. (I also have a possibly symbolic S-C isogloss #272
*p-a(u)+ for the same meanings.)

G#95 'mother' in all but A Songay as ya. My #278 *ya in
BK|CDFH|EJL is considered to be a symbolic (nursery) term, again
without much conviction.

G#109 'rain' in all but C Maba as hari, war, koro. This is the
extension of Greenberg's E.S. #86 and is my #2 as above under E.S.

3.3.3 Conclusions

The Nilo-Saharan set of Greenberg's is more problematical than 
the East Sudanic one, but still includes enough of substance to 
make his Nilo-Saharan worth pursuing. 

In fact, this is just what I have been doing. The degree of
overlap of our results is impressive, especially when one
considers that I have not consulted Greenberg's particular proposed
E.S. and N-S forms in any extensive or systematic way in all the
years of my work. None of the above items from Greenberg looked
familiar to me when I tracked them down for this paper.

However, by relying on "mass comparison" and scorning
regularity and reconstruction, Greenberg committed many errors.
Examples are his using his E.S. #4, 32, 65, 86, 126 as evidence for
both E.S. and N-S, and doubtful judgments involved in the makeup of
his E.S. #65 and 78 and N-S #61 and 87.

The "mass comparisons" missed most of my 16 "excellent", 69
"good" and 88 "fair" N-S isoglosses. Consider only the "excellent"
ones: those which include representation in all four branches of
N-S: A Songay, B Saharan, K Kuliak, and S-C Satellite-Core (see
diagram in 3.3 above). (I have been very conservative in
rejecting other potential "excellent isoglosses" because of
possible symbolism or diffusion.) I give these with only main
glosses.

 #1  'belly, intestines'      *ar       in ABK|CH|EkJ
 #2  'rain, river'            *ar       in ABK|CDFcGH|EI
 #3  'work, make, change'     *bEr      in ABK|CH|IJL
 #4  'stick, spear, bow'      *bEr      in ABK|CDFG|EIL
 #5  'wing, neck'             *bi ~ bo  in ABK|CG|EnIJ
 #6  'many, big'              *bo       in ABK|FH|EnIJ
 #7  'ashes, earth'           *bo/an    in ABK|CH|EJ
 #8  'rib, side, horn'        *der      in ABK|DGH|-
 #9  'brother, man'           *er       in ABK|CD|EIL
#10  'follow, hunt'           *kor      in ABK|CH|IJ
#11  'elbow, foot, finger'    *kor2     in ABK|CDGH|EIL
#12  'horn, bone, rib'        *k+Ob     in ABK|CDFp|EnIL
#13  'lake, river, well'      *kuR      in ABK|CDF|EL
#14  'say, ask, count'        *nV       in ABK|CDFG|EJL
#15  'many, all'              *Pat      in ABK|CDGH|L
#16  'fall, return'           *tl+t     in ABK|CDFH|IJL

Of these, only #2 appeared in the discussion of 3.3.1-2 above
(as G#86 under E.S. and G#109 under N-S). It is very revealing
that very few of the above isoglosses are reflected in Greenberg's
three lists of "mass comparison" items: East Sudanic (1963:
95-108), "Chari-Nile" (ibid. 117-127), Nilo-Saharan (ibid. 133-148).
These are pieces of my #4 in "Chari-Nile" #90 (Nilotic and
Kunama), pieces of my #5 in C-N #3 and N-S #5 (Saharan, Maba, Berta,
Didinga, Koman), pieces of my #7 in C-N #9 and N-S #9 (Songay,
Berta?), pieces of my #9 in E.S. #71 and N-S #91 (Songay, Saharan,
Nubian, and Nilotic). There are single instances elsewhere also.

The conclusion is that the "method of resemblances" (Heine 1972)
or "mass comparison" or "multilateral comparisons" is doomed to
failure as other than a pre-theoretical first step: it results in
missing most of the good candidates for isoglosses by jumbling
together parts of real isoglosses and items wrongly judged to be
cognates on the similarity basis. For one example of the latter,
G's N-S #9 includes my #7 and also my #336 bUr, not an N-S
isogloss, being found widespread also in Afrasian and in Mande.

Another interesting way in which MLC misses the boat is that it
fails to reveal such interesting phenomena as that illustrated in
item #11 above, namely the regular correspondence sets with
patterning according to sub-classification. In #11, r2 stands for a
proposed proto-phoneme which is realized as r/r~l/l in the modern
languages (recall that the notation means Outliers/Satellites/
Core). My analysis revealed also similar d2, t2, and e2 (see also
o2 or o/a in #7 above).

To conclude, Greenberg's Nilo-Saharan work succeeded not
because of but despite his espousal of "Multilateral Comparison".
It included a large-enough data base and enough sound judgments to
lead him to the right outline of N-S despite his rejection of
regular correspondences and reconstruction and the slipshod
appearance of much of the supporting presentation. I believe this
conclusion would also apply to the Niger-Congo (=Niger-Kordofanian)
and the Afrasian (=Afroasiatic) work. I reserve judgment on the
Khoisan work because I have no expertise in that area. This is
another illustration of what Newman (1974: 648) refers to as being
able to recommend the cook but not the cookbook!

Notes

1 Note that this is not exactly parallel to tossing a coin or
rolling dice. We cannot assume that there is a fixed number of
forms which is the same at each trial, like the six possibilities
for one die. For each language, the forms are largely different
from those of each other language (if not, there would be little
point in comparing them) and we are looking for how many times
forms from one language are similar enough to those in others to
be considered a "match". This is like looking for how many 1's, or
2's, etc. are rolled in 12 trials with a die, except that of course
all 1's are not similar but actually identical. See the attached
chart for how this works for the first two trials.

Handout for M. L. Bender: Testing Multilateral Comparisons
in Africa (LSA Meeting Jan. 5, 1995)

Assume 12 languages abbreviated as A, B, C,...L; 100-item lists
for each, excluding loans, compounds, etc. Examine item n on the
list in A, B, etc. in turn. Call the form for item n in language A
a_n, etc.

Assume that the MLC "measure of similarity" from language to
language is a modest 5% (Greenberg 1963: 3 refers to it as
"accident [sic] resemblances between two languages" and in his
example uses 20% to make his case stronger). It is clear from
context that he means this to apply to any two languages under
consideration.

Now develop the branching tree from left to right for (judged
to be) "same" (5% probability) or "different" (95% probability).

We run into a problem after the second language: the third item
may be judged "same" or "different" relative to either the first
(in language A) or the second one (in language B). This is unlike
throwing a die, for which the outcome is simply the number which
turns up.

To develop the tree we have to assume transitivity for "same":
if a is "same as" b and b is "same as" c, then a is "same as" c.
But this does not hold for "different from", since a can be "diff.
from" b and b "diff. from" c, but a and c can be "same"!

Lang. A   Lang. B       Lang. C                                    Forms          Product

a_n       same (.05)    same (.05)                                 a_n/a_n/a_n    .0025
                        diff. (.95)                                a_n/a_n/c_n    .0475

          diff. (.95)   same as b_n (.05)                          a_n/b_n/b_n    .0475
                        diff. from b_n (.95), same as a_n (.05)    a_n/b_n/a_n    .045125
                        diff. from both a_n, b_n (.95 x .95)       a_n/b_n/c_n    .857375

          Sum: 1.00                                                               Sum: 1.00



The two extremes of the chart are simple. Greenberg's case is
the top: probability of all forms agreeing. The bottom is the case
I discuss on p. 2: all items are different. Everything except the
bottom is the case of at least one match. The problem arises with
the middle of the chart where there is exactly one match. Of
course the middle gets more and more complicated as the
comparisons are extended to four, five, etc. languages and there
can be exactly one, two, three, etc. matches.
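
The three-language chart can be re-derived with a short Python
sketch. It follows the branching described above (5% for "same", 95%
for "different", with "same" treated as transitive); the outcome
labels are just the form patterns from the chart.

```python
P_SAME = 0.05
P_DIFF = 1 - P_SAME

# Path probabilities for languages A, B, C, following the chart's branches.
outcomes = {
    "a/a/a": P_SAME * P_SAME,            # B same as A, C same as both
    "a/a/c": P_SAME * P_DIFF,            # B same as A, C different
    "a/b/b": P_DIFF * P_SAME,            # B differs, C same as B
    "a/b/a": P_DIFF * P_DIFF * P_SAME,   # B differs, C differs from B, same as A
    "a/b/c": P_DIFF * P_DIFF * P_DIFF,   # all three different
}

total = sum(outcomes.values())
all_agree = outcomes["a/a/a"]               # Greenberg's case (top of chart)
no_match = outcomes["a/b/c"]                # bottom of chart
at_least_one_match = total - no_match       # everything except the bottom
exactly_one_match = at_least_one_match - all_agree  # the middle of the chart

for pattern, prob in outcomes.items():
    print(pattern, prob)
print(total, at_least_one_match, exactly_one_match)
```

With only three languages, the chance of at least one match is
already about 14%, against a quarter of one percent for all three
forms agreeing.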